Software Tools for Robust Analysis of High-Dimensional Data

Authors

  • Valentin Todorov
  • Peter Filzmoser
Abstract

This work discusses robust multivariate methods specifically designed for high dimensions. Their implementation in R is presented and their application is illustrated with examples. The first group comprises algorithms for outlier detection, already introduced elsewhere and implemented in other packages. The added value of the new package is that all methods follow the same design pattern and thus can use the same graphical and diagnostic tools. The next topic covered is sparse principal components, including an object-oriented interface to the standard method proposed by Zou, Hastie, and Tibshirani (2006) and the robust one proposed by Croux, Filzmoser, and Fritz (2013). Robust partial least squares (see Hubert and Vanden Branden 2003) as well as partial least squares for discriminant analysis conclude the scope of the new package.

Similar resources

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: With advances in science, knowledge, and technology, we increasingly deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimensional problems, classical methods are not reliable because of a large number of predictor variable...

Methods for regression analysis in high-dimensional data

With advances in science, knowledge, and technology, new and precise methods for measuring, collecting, and recording information have been developed, which have resulted in the appearance and development of high-dimensional data. A high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...

Robust Gaussian Graphical Modeling with the Trimmed Graphical Lasso

Gaussian Graphical Models (GGMs) are popular tools for studying network structures. However, many modern applications such as gene network discovery and social interactions analysis often involve high-dimensional noisy data with outliers or heavier tails than the Gaussian distribution. In this paper, we propose the Trimmed Graphical Lasso for robust estimation of sparse GGMs. Our method guards ...

Robust Estimation in Linear Regression with Multicollinearity and Sparse Models

One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...

Simulation of Smoke Emission from Fires in High-Rise Buildings Using the 3D Model Generated from 2-Dimensional Cadastral Data

A 3-dimensional model of high-rise buildings can be used in disaster management, for example in fire cases, to reduce casualties. The fundamental difficulty in 3D building modeling is the unavailability of suitable data sources. However, available 2D cadastral maps could be used as low-cost and attainable resources for 3D building modeling. Smoke will be a great threat to people's health during a f...

Publication date: 2014